# Inference optimization
Llama 3.1 Nemotron Nano 4B V1.1 GGUF
Other
Llama-3.1-Nemotron-Nano-4B-v1.1 is a large language model derived from Llama 3.1 that balances accuracy and efficiency. It is suited to use cases such as AI agents and chatbots.
Large Language Model
Transformers English

Mungert
2,177
1
QwQ 32B FP8 Dynamic
MIT
FP8-quantized version of QwQ-32B. Dynamic quantization reduces storage and memory requirements by 50% while retaining 99.75% of the original model's accuracy.
Large Language Model
Transformers

nm-testing
3,895
3
Featured Recommended AI Models